Prediction of digital transformation of manufacturing industry based on interpretable machine learning

Enhancing digital transformation is of paramount importance for business development. This study employs machine learning to establish a predictive model for digital transformation, investigates the crucial factors that influence it, and proposes corresponding improvement strategies. First, four commonly used machine learning algorithms are compared, revealing that the extremely randomized trees classifier (ETC) yields the most accurate predictions. Subsequently, through correlation analysis and recursive elimination, the key features that affect digital transformation are selected, resulting in the corresponding feature subset. Shapley Additive Explanation (SHAP) values are then used to perform an interpretable analysis of the predictive model, elucidating the effect of each key feature on digital transformation and yielding critical feature values. Lastly, informed by practical considerations, we propose a quantitative adjustment strategy to raise the degree of digital transformation in enterprises, which provides guidance for digital development.


Introduction
In the context of economic globalization, the economic structure has shifted from being primarily reliant on agriculture and industry to being increasingly driven by digitalization. In comparison to traditional economies, the digital economy leverages automation and digital tools to enhance efficiency and reduce costs, enabling businesses to adapt rapidly to market fluctuations and emerging trends. Consequently, the effective development of the digital economy has emerged as a pivotal concern for enterprises [1]. In the current era, marked by the growing convergence of the digital economy and the physical economy, the digital transformation of enterprises has become a pivotal catalyst for strengthening their competitive edge within the digital sphere [2], and it has garnered increasing attention as a research hotspot within the academic community [3].
Presently, scholarly research on the digital transformation of enterprises primarily concentrates on the effects of digitalization on pivotal aspects such as innovation capabilities [1,4], corporate value [5][6][7], capital markets [8] and sustainable development [9,10]. Ciampi et al. [11] analyzed the impact of digital transformation on business innovation models using data from 253 UK companies. Their findings highlighted that digital transformation can enhance the innovation capabilities of firms by influencing their operational strategies and objectives. Similarly, Ferreira et al. [12] demonstrated that top management can leverage new digital decision-making processes to improve competitiveness through digital transformation. Meanwhile, Zhong et al. [9] investigated the effects of digital transformation on Environmental, Social, and Governance (ESG) factors, revealing a significant enhancement in ESG performance through digital transformation. However, these studies only demonstrate the benefits of digital transformation for firm development. As such, it is crucial to explore ways to enhance digital transformation capabilities at the firm level, thereby promoting digital development within enterprises.
Machine learning, an emerging technology in the field of computer science, enables computers to acquire knowledge from data, complete model training, and perform classification and prediction on new datasets based on previously acquired knowledge. Within the domain of economics, machine learning has been applied extensively [13][14][15]. Craja et al. [16] used deep learning to analyze the text of companies' year-end reports and, by exploiting context, judged more accurately whether a company had committed financial fraud. Mark et al. [17] used machine learning algorithms to predict the presence of financial irregularities in Vietnamese listed companies. They recommended that regulatory bodies intensify their focus on the financial statements of lower-ranked companies in order to detect potential anomalies.
Currently, China's manufacturing industry confronts the predicament of high input, high energy consumption, low efficiency, and low output. Digital transformation is a viable way to address these challenges. Hence, this article adopts Chinese listed manufacturing companies as a case study to investigate the impact of various indicators on digital transformation through machine learning. The study further puts forth adjustment strategies aimed at strengthening the capability for digital transformation and accelerating the digitalization process.

Data set construction
This study investigates the Digital Transformation Capability (DCG) of companies listed on the A-share market of China's Shanghai and Shenzhen Stock Exchanges from 2014 to 2021. A comprehensive dataset was constructed by selecting companies from the CSMAR [18] and CRNDS [19] databases, resulting in 22,776 initial observational samples. CSMAR and CRNDS are authoritative databases for the Chinese securities market, providing multidimensional research and regulatory data. The specific data sets can be found in S1 Dataset. These samples represent a total of 5,072 listed companies and form the basis of the study. To ensure data quality, samples with missing values were excluded, and continuous variables were winsorized at the 1st and 99th percentiles to manage outliers. Moreover, to focus specifically on the manufacturing industry, data from sectors C1, C2, C3, and C4 were extracted based on industry classification, resulting in a final dataset of 12,057 samples. The meanings of C1, C2, C3 and C4 are provided in S1 File.
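The outlier treatment at the 1st and 99th percentiles can be illustrated with a minimal sketch. It assumes that the treatment is a standard winsorization (values beyond the percentile bounds are clipped to the bounds) and uses linear interpolation between ranks as the percentile convention; the function name `winsorize` and the sample values are hypothetical and not taken from the study.

```python
def winsorize(values, lower_pct=1.0, upper_pct=99.0):
    """Clip values to the [lower_pct, upper_pct] percentile range."""
    ordered = sorted(values)
    n = len(ordered)

    def percentile(p):
        # Linear interpolation between the two closest ranks
        # (one common convention; others exist).
        pos = (n - 1) * p / 100.0
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        return ordered[lo] * (1 - frac) + ordered[hi] * frac

    low, high = percentile(lower_pct), percentile(upper_pct)
    # Values below `low` become `low`; values above `high` become `high`.
    return [min(max(v, low), high) for v in values]


# Hypothetical sample: the integers 1..100 plus one extreme outlier.
sample = list(range(1, 101)) + [10_000]
clipped = winsorize(sample)
```

Clipping rather than deleting keeps the sample count unchanged, which matters when each row is a firm-year observation.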
Digital transformation, abbreviated here as DCG, refers to the transformative process in which companies harness digital technology and the internet to reshape their business models, processes, organizational structures, and corporate cultures. The ultimate goal is to foster business innovation and optimize operational efficiency. Nevertheless, there is currently no standardized framework for quantitatively evaluating DCG. In this study, Python web scraping was used to extract the occurrences of pertinent keywords, including "artificial intelligence technology," "blockchain technology," "cloud computing technology," "big data technology," and "digital technology application," from the annual reports of the companies in the dataset [12]. The resulting keyword counts served as indicators of the extent of DCG across these enterprises. Moreover, to obtain a more uniform distribution of DCG, a logarithmic transformation was applied to the word frequencies, as outlined in Formula (1).
where N represents the number of occurrences of the keywords. Fig 1 depicts the distribution of DCG within the dataset, revealing that more than 99% of the samples exhibit a DCG value lower than 5. This finding underscores a notable imbalance among the samples. To improve subsequent predictions and strengthen the model's generalization capacity, a threshold of 1.5 was applied to DCG. A DCG value below 1.5 indicates a lower level of digital transformation for the enterprise and received a label of 0. Conversely, a DCG value exceeding 1.5 denotes a higher degree of digital transformation and received a label of 1. As a result, the final dataset comprised 6,280 samples labeled 0 and 5,777 samples labeled 1.
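The labeling step can be sketched as follows. Formula (1) is not reproduced in this text, so the sketch assumes the common form ln(1 + N) for the logarithmic transformation; this is an assumption, as are the function names and the example counts. Only the threshold of 1.5 comes from the text.

```python
import math

DCG_THRESHOLD = 1.5  # threshold stated in the text


def dcg_from_count(n_keywords):
    # Assumption: ln(1 + N), so that a report with no keywords maps to DCG = 0.
    return math.log1p(n_keywords)


def dcg_label(dcg, threshold=DCG_THRESHOLD):
    # 1 = higher degree of digital transformation, 0 = lower.
    # The text does not say how a value of exactly 1.5 is handled;
    # here it falls into class 0.
    return 1 if dcg > threshold else 0


# Hypothetical keyword counts for five annual reports.
counts = [0, 2, 4, 10, 150]
labels = [dcg_label(dcg_from_count(n)) for n in counts]
```

Under this assumed transformation, the threshold of 1.5 separates reports with at most three keyword occurrences from those with four or more.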

Feature engineering
To establish a predictive model for DCG, a range of enterprise information features were selected; they are presented in Table 1. The features were categorized into financial and non-financial indicators. The former comprise a company's development ability, debt-paying ability, profitability, cash acquisition ability, and operational ability, whereas the latter consist of corporate governance, equity structure, financing capability, company size, years since establishment, and company value.

Machine learning model
Algorithm introduction.
Machine learning algorithms can be categorized into regression and classification algorithms according to the type of the target variable. The target here is a DCG label of 0 or 1, so the task is a classification problem. In classification, the model is trained on the features and labels of the sample data and then used to predict unseen data. To assess the performance of different algorithm types on the dataset, this study selects extremely randomized trees (ETC) [20] as representative of Bagging algorithms, gradient boosting machines (GBM) [21] as representative of Boosting algorithms, support vector machines (SVM) [21] as representative of hyperplane-based methods, logistic regression (LOG), and the multi-layer perceptron (MLP) [22,23] as representative of neural networks. The principles of the specific algorithms are provided in S1 File.

Model validation and evaluation indicators.
To mitigate overfitting during training, this study uses both the holdout method and cross-validation to build the model. The dataset is randomly partitioned into a training set and a testing set in an 8:2 ratio. The model is trained on the training set, while the testing set is used to assess the model's generalization ability. To minimize the model's generalization error, 5-fold cross-validation is employed during the training process [24].
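The combination of an 8:2 holdout split with 5-fold cross-validation on the training portion can be sketched with index lists only. This is a minimal illustration, not the study's implementation: the function names, the fixed random seed, and the interleaved fold assignment are assumptions.

```python
import random


def holdout_split(n_samples, test_ratio=0.2, seed=0):
    """Shuffle sample indices and split them into (train, test)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n_samples * test_ratio))
    return idx[n_test:], idx[:n_test]


def k_fold(indices, k=5):
    """Yield (train, validation) index lists for k roughly equal folds."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, val


# 12,057 is the final sample count reported in the text.
train_idx, test_idx = holdout_split(12057)
folds = list(k_fold(train_idx, k=5))
```

The test indices are never passed to `k_fold`, so model selection and tuning see only the training portion.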
For classification models [25], particularly binary classification models, performance is often assessed using a confusion matrix. The confusion matrix summarizes the model's classification predictions as counts of True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN), as outlined in Table 2. These counts serve as the basis for computing evaluation metrics such as Accuracy, Precision, Recall, and F1 score, which are determined through Formulas (2)-(5).
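Formulas (2)-(5) are not reproduced in this text. The sketch below uses the standard definitions of the four metrics in terms of the confusion-matrix counts; the function name and the example counts are hypothetical and are not results from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
m = classification_metrics(tp=40, fp=10, tn=35, fn=15)
```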
To further evaluate the effectiveness of the classification models, the Receiver Operating Characteristic (ROC) curve is used to compare their generalization capabilities [26,27]. The ROC curve shows the relationship between the true positive rate (TPR) and the false positive rate (FPR) of a given classifier. The horizontal axis of the plot is the FPR, and the vertical axis is the TPR. Each point on the ROC curve corresponds to the TPR and FPR at a particular decision threshold. A classifier whose ROC curve lies closer to the upper left corner performs better, and a larger area under the ROC curve (AUC) indicates better model performance.
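How the ROC curve and its area arise from predicted scores can be sketched in a few lines. The sketch sweeps the decision threshold from high to low and integrates with the trapezoidal rule; the labels and scores are made up for illustration, and the function names are hypothetical.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by lowering the threshold step by step."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    ranked = sorted(zip(scores, y_true), key=lambda p: -p[0])
    tp = fp = 0
    points = [(0.0, 0.0)]
    i = 0
    while i < len(ranked):
        s = ranked[i][0]
        # Samples with tied scores share one threshold, so process them together.
        while i < len(ranked) and ranked[i][0] == s:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical labels and predicted scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
area = auc(roc_points(y_true, scores))
```

For this toy input, eight of the nine positive-negative pairs are ranked correctly, which is what the area of 8/9 expresses.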

Hyperparameter optimization.
After the best machine learning algorithm has been determined, its hyperparameters need to be optimized. Common methods for hyperparameter optimization are grid search, random search, and Bayesian optimization [28,29].
Grid search is a systematic approach to hyperparameter tuning. It involves defining the range of candidate hyperparameter values and then exhaustively evaluating all combinations using cross-validation or another evaluation procedure. The hyperparameters with the highest observed performance are selected. The main advantage of grid search is that it is guaranteed to find the best combination within the specified grid; the downside is its high computational cost, particularly when there are many parameters.
Random search, on the other hand, samples hyperparameters at random within a given range. By specifying the number of iterations or a stopping criterion, relatively good hyperparameters can be found within a fixed time or evaluation budget. Compared to grid search, random search is simpler and computationally cheaper. However, because the parameters are selected randomly, there is no guarantee of finding the global optimum.
Bayesian optimization is a sequential model-based optimization method built on Bayesian inference [30][31][32]. It is used to optimize the input parameters of a black-box function. Compared to grid search and random search, Bayesian optimization can typically identify good parameters with fewer evaluations. Its fundamental idea is to build a probabilistic model that estimates performance from the parameters already observed and their corresponding function values. Generally, this model places a Gaussian process prior on the function values and is updated through Bayesian inference. In each iteration, the parameter setting most likely to achieve better performance is selected for evaluation based on the model's predictions. By repeating this process, Bayesian optimization gradually approaches the optimum.
Considering the large dataset and the complex range of hyperparameters in this paper, Bayesian optimization is chosen for hyperparameter optimization. Accuracy is used as the evaluation metric, and the best hyperparameter combination is obtained through iterative optimization. Table 3 presents the hyperparameters obtained for the four machine learning models after applying Bayesian optimization.
The ROC curves of the four machine learning models are displayed in Fig 2(B). The ROC curve of the ETC model almost entirely encloses the curves of the other three models, indicating its superior performance. Furthermore, the ETC model attains the largest AUC, 0.82, among all the models, further confirming that it is the most accurate predictor. Consequently, the subsequent analysis is carried out using the ETC model.
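To make the contrast between grid search and random search described in this section concrete, the following sketch runs both against a toy objective. The objective `toy_score` is a hypothetical stand-in for cross-validated accuracy, and the parameter names, ranges, and iteration count are assumptions. Bayesian optimization, the method actually chosen in the paper, is not implemented here because it requires a surrogate model.

```python
import itertools
import random


def toy_score(n_estimators, max_depth):
    # Hypothetical stand-in for cross-validated accuracy; peaks at (300, 12).
    return (0.75 - ((n_estimators - 300) / 1000) ** 2
            - ((max_depth - 12) / 50) ** 2)


def grid_search(score, grid):
    """Evaluate every combination in the grid and return the best one."""
    names = list(grid)
    best = max(itertools.product(*(grid[n] for n in names)),
               key=lambda combo: score(*combo))
    return dict(zip(names, best))


def random_search(score, bounds, n_iter=50, seed=0):
    """Sample integer parameters uniformly within bounds; keep the best."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.randint(lo, hi) for name, (lo, hi) in bounds.items()}
        s = score(**params)
        if s > best_score:
            best_params, best_score = params, s
    return best_params, best_score


grid = {"n_estimators": [100, 300, 500], "max_depth": [4, 12, 20]}
best_grid = grid_search(toy_score, grid)
best_rand, best_rand_score = random_search(
    toy_score, {"n_estimators": (50, 600), "max_depth": (2, 30)})
```

Grid search costs one evaluation per grid cell (nine here), whereas the budget of random search is set directly by `n_iter`.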

Feature screening
Because feature redundancy was not considered when the initial features were chosen, the dataset needs to be refined after the optimal algorithm has been selected. Fig 3 presents the heatmap of feature correlations. When two features have a correlation greater than 0.95 [33,34], they are highly correlated, and one of them may be removed. In Fig 3, the correlation coefficient between Size and SA is 0.99, so Size can be removed, leaving 19 features. Recursive Feature Elimination (RFE) and Exhaustive Feature Selection (EFS) are then applied for further feature selection. RFE is a wrapper-type feature selection approach: it trains a model on the full feature set, ranks the features by their weights, removes the feature with the smallest weight, and retrains the model on the remaining features. This procedure is repeated to yield a sequence of feature subsets.
Based on the models' performance in each iteration, the optimal feature subset and its corresponding model were chosen. The feature selection process using recursive elimination is depicted in Fig 4(A). As the number of features in the subset increased, the model's accuracy improved steadily, reaching 0.756 with 8 features. Beyond 8 features, no further improvement was observed. Thus, the subset of 8 features was identified as the best, comprising RDpr, RDeapoin, Lev, ATO, Top1, Balance1, Mshare, and SA. The exhaustive method explores all feature combinations within the subset and evaluates the model's performance for each combination. Fig 4(B) illustrates the feature selection process using EFS. Among all feature combinations, the combination of 8 features achieved the highest accuracy, confirming RDpr, RDeapoin, Lev, ATO, Top1, Balance1, Mshare, and SA as the critical features.
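The two screening steps described in this section, correlation filtering at 0.95 and recursive elimination, can be sketched generically. This is an illustration under stated assumptions rather than the study's code: the feature values, the weights in `W`, and the scoring rule are made up, and `importance_fn` and `score_fn` are hypothetical callables standing in for a trained model's feature weights and its validation accuracy.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(columns, threshold=0.95):
    """Keep a feature only if it is not highly correlated with one already kept."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def recursive_elimination(features, importance_fn, score_fn):
    """Repeatedly drop the lowest-weight feature; return the best (subset, score)."""
    current = list(features)
    history = [(list(current), score_fn(current))]
    while len(current) > 1:
        weights = importance_fn(current)
        current.remove(min(current, key=weights.get))
        history.append((list(current), score_fn(current)))
    return max(history, key=lambda h: h[1])


# Hypothetical data: Size is almost a linear function of SA.
columns = {"SA": [1, 2, 3, 4, 5],
           "Size": [2.1, 4.0, 6.2, 7.9, 10.1],
           "Lev": [0.5, 0.2, 0.6, 0.3, 0.4]}
kept = drop_correlated(columns)

# Hypothetical weights and a score that penalizes larger subsets.
W = {"a": 0.5, "b": 0.3, "c": 0.01}
best_subset, best_score = recursive_elimination(
    list(W),
    importance_fn=lambda subset: {f: W[f] for f in subset},
    score_fn=lambda subset: sum(W[f] for f in subset) - 0.05 * len(subset))
```

Which member of a correlated pair is dropped depends on column order in this sketch; here SA is listed first, so Size is the one removed, matching the choice reported in the text.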

Interpretability analysis
Although machine learning achieves high prediction accuracy, its internal prediction process remains opaque, which makes such models black boxes. To enhance the interpretability of the model, this study introduces Shapley Additive Explanation (SHAP) values for analysis [30]. SHAP values are a method for quantifying the significance of features within predictive models. They provide insight into the relative contribution that each feature makes to the model's predictions. This is achieved by considering permutations and combinations of the input features and assigning each feature a weight according to its influence on the predicted outcome, thereby producing a SHAP value for each feature. Analyzing these SHAP values yields information about feature importance rankings and feature interactions within the model, which helps to understand and explain the model's decision-making process. A detailed explanation of SHAP values is given in S1 File.
[...] and investment supports the organization in undertaking more technological innovation and product development. This, in turn, helps companies maintain a technological advantage and introduce more competitive digital products and services. Moreover, research and development personnel and expenses play a pivotal role in safety and risk management. A greater proportion of research and development personnel and spending enables companies to develop and implement comprehensive security strategies, safeguarding data and systems throughout the digital transformation process [36,37].
SA represents the financial constraints of the enterprise. A larger SA indicates potential funding shortages for the company. Digital transformation, on the other hand, necessitates substantial investments, particularly in technology research and development, data analysis, and automation, which demand significant resources and funding [38]. When confronted with financial constraints, the company's digital transformation initiatives may face limitations, impeding the pace and effectiveness of the transformation [39].
Fig 6(D) depicts the distribution of SHAP values corresponding to Lev. In the range 0.3 < Lev < 0.7, the majority of SHAP values for the samples are positive, indicating conditions that benefit the enterprise's digital transformation. Conversely, negative values are detrimental to the digital transformation of the enterprise. Lev, the leverage ratio of an enterprise, has a dual impact on digital transformation. On the one hand, digital transformation entails significant financial investments. With a low debt ratio, the company may encounter a shortage of initial funding, impeding the smooth execution of digital transformation plans [40]. On the other hand, companies with a high debt ratio must address management contracts and debt repayment concerns. Consequently, the scale of investment in digital transformation may be constrained, with adverse effects on the speed and efficacy of the transformation [41]. Further, companies with a high debt ratio tend to prioritize cost control, resulting in relatively limited investment in digital transformation and innovation. This, in turn, influences their digital capabilities and competitiveness. By contrast, companies with a moderate debt ratio emphasize management efficiency by distributing resources and assets effectively, thereby achieving a more efficient digital transformation.
[...] Balance1, respectively. When Top1 is less than 0.2 or greater than 0.6, SHAP values tend to be positive, which is advantageous for digital transformation. Top1 represents the shareholding proportion of the largest shareholder in the company. A higher shareholding proportion for the largest shareholder typically signifies greater capacity for resource investment, encompassing both funding and technological resources. Consequently, the largest shareholder can allocate resources more flexibly to support the technological and equipment investments required for digital transformation. Moreover, a higher proportion of shares held by the largest shareholder generally implies a stronger influence on corporate decision-making. Digital transformation requires a series of strategic decisions and transformative measures [42]. A higher proportion of shares held by the largest shareholder ensures greater influence and control over these decisions, thereby facilitating a smooth transition. Conversely, a lower proportion of shares held by the largest shareholder may prompt the company to prioritize effective management mechanisms and processes. To attract investment and support from other shareholders, the organization may strengthen internal management and enhance efficiency. This is advantageous for digital transformation, which typically requires efficient organization and processes. At the same time, a smaller proportion of shares held by the largest shareholder may suggest a more open attitude towards external investment. Attracting external investment can bring fresh funding, technology, and market resources to the company, thus expediting the progress of digital transformation.
Mshare, the percentage of shares owned by management, exerts a multifaceted influence on the digital transformation of enterprises. On the one hand, a significant shareholding gives management more sway and control over corporate decision-making. This may foster a proactive approach to driving and executing digital transformation initiatives, thereby expediting decision-making and improving execution efficiency. On the other hand, an elevated management shareholding may lead to an excessive concentration of power within the organization and, consequently, to a lack of effective oversight and counterbalancing mechanisms. Such circumstances can pave the way for excessive managerial centralization, aggravating the shortage of appropriate feedback and constraints and posing a risk to the quality and efficacy of digital transformation decision-making [43].
Balance1 refers to the degree of equity balance in a company. An excessively high degree of equity balance suggests that power is dispersed among multiple shareholders, each with their own opinions and vested interests. This can lead to slow and intricate decision-making processes that require more negotiation and compromise, thereby impeding the progress of digital transformation projects. Conversely, a lack of equity balance can result in the interests of other shareholders or stakeholders being neglected [44,45], so that digital transformation initiatives may be decided and implemented without fully considering the interests and objectives of the whole company. Consequently, this may weaken the overall competitiveness and sustainability of the company in the long run.
Fig 6(H) shows the SHAP value distribution of ATO, indicating that ATO > 0.7 is advantageous for a company's digital transformation. ATO represents the total asset turnover ratio, an important metric for assessing a company's efficiency in asset operation. It reflects the relationship between the sales generated through operational activities and the total assets of the company within a specific time period. A higher total asset turnover ratio signifies greater efficiency in asset operation, enabling the company to use its assets more effectively to drive sales [46]. Considering that digital transformation typically requires substantial investment, a high turnover ratio can free up and redirect funds, providing additional resources to support the technological and equipment investments needed for digital transformation [47]. Moreover, a high total asset turnover ratio also requires the company to adapt quickly to changes in market demand and to adjust production and sales strategies flexibly. Digital transformation can provide more comprehensive data and analysis resources, enabling the company to make more precise market forecasts and better-informed decisions. By enhancing a company's innovative capabilities and competitiveness, digital transformation can in turn further improve the total asset turnover ratio.
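The idea behind SHAP values described at the start of this section, averaging a feature's marginal contribution over all combinations of the other features, can be illustrated with an exact computation on a tiny model. This is a sketch under assumptions: `toy_model` is a hypothetical scoring function and not the paper's ETC model, the feature values are made up, and absent features are replaced by a fixed baseline value. Practical SHAP implementations rely on faster algorithms, because exact enumeration grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a baseline input.

    Features outside a coalition are set to their baseline value.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {f: (x[f] if f in coalition else baseline[f]) for f in names}
        return predict(mixed)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_model(row):
    # Hypothetical score with one interaction term (RDpr x ATO).
    return 2.0 * row["RDpr"] - 1.0 * row["SA"] + 0.5 * row["RDpr"] * row["ATO"]


x = {"RDpr": 0.3, "SA": 2.0, "ATO": 1.0}          # sample to explain
baseline = {"RDpr": 0.1, "SA": 3.0, "ATO": 0.5}   # reference input
phi = shapley_values(toy_model, x, baseline)
```

The values sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that gives SHAP its name. A positive value for a feature means that its value in `x` pushes the prediction up relative to the baseline.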

Quantitative adjustment strategies for improving digital transformation
Based on our predictive model and interpretability analysis, we propose a quantitative strategy to enhance companies' digital transformation [48]. Among the eight key features, RDeapoinr, Lev, and ATO are relatively easy to adjust, whereas the remaining features are relatively fixed. Therefore, in practical implementation, RDeapoinr, Lev, and ATO can be adjusted while the other features are held unchanged, in order to move companies originally labeled 0 to label 1.
When planning their subsequent development, companies need to consider not only the predictions of the model but also additional factors such as cost and efficiency. Consequently, we propose an adjustment model, represented by Formulas (6)-(9). More specifically, we constrain the values of RDeapoinr, Lev, and ATO to fall within the critical value ranges proposed in Section 3.3, namely 0.05 < RDeapoinr < 0.4, 0.3 < Lev < 0.7, and 0.7 < ATO < 2.5, while the remaining features remain fixed. These values are then substituted into predictive model (6) to derive the corresponding predicted labels. We select the candidates whose predicted label is 1 and, based on the constraints outlined in (7), (8), and (9), obtain the final adjusted feature values.

$$P = \mathrm{Model}(\mathrm{RDeapoinr},\ \mathrm{Lev},\ \mathrm{ATO},\ F) \tag{6}$$
where P represents the predicted label, Model represents the established predictive model, RDeapoinr, Lev, and ATO are the respective features, F represents the remaining unchanged features, and RDeapoinr*, Lev*, and ATO* represent the adjusted feature values.
This paper presents a case study of two listed companies, with stock codes 000004 and 000017, in 2016 to demonstrate how digital transformation can be enhanced through the adjustment strategy, as illustrated in Table 4. The company with stock code 000004 started with RDeapoinr, Lev, and ATO values near their respective critical points. Subsequent feature adjustments increased these metrics, and the corresponding label changed from 0 to 1.
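The adjustment procedure can be sketched as a constrained search. Constraints (7)-(9) are not reproduced in this text, so the sketch substitutes an assumed criterion: among candidates inside the stated critical ranges that the model labels 1, pick the one with the smallest total relative change from the original values. That criterion, the grid resolution, `toy_model`, and the example company values are all hypothetical; only the three ranges come from the text.

```python
from itertools import product

# Critical ranges stated in the text (open intervals).
RANGES = {"RDeapoinr": (0.05, 0.4), "Lev": (0.3, 0.7), "ATO": (0.7, 2.5)}


def interior_grid(lo, hi, steps):
    """Evenly spaced points strictly inside (lo, hi)."""
    return [lo + (hi - lo) * (i + 1) / (steps + 1) for i in range(steps)]


def adjust(model, original, fixed, steps=8):
    """Return the label-1 candidate closest to `original`, or None."""
    names = list(RANGES)
    grids = [interior_grid(*RANGES[n], steps) for n in names]
    best, best_cost = None, float("inf")
    for values in product(*grids):
        candidate = dict(zip(names, values))
        if model(candidate, fixed) != 1:
            continue
        # Assumed cost: sum of changes, each scaled by the width of its range.
        cost = sum(abs(candidate[n] - original[n]) / (RANGES[n][1] - RANGES[n][0])
                   for n in names)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best


def toy_model(adjustable, fixed):
    # Hypothetical classifier standing in for the trained model P = Model(...).
    score = (4 * adjustable["RDeapoinr"] + adjustable["ATO"]
             - abs(adjustable["Lev"] - 0.5) + fixed["bias"])
    return 1 if score > 2.0 else 0


original = {"RDeapoinr": 0.02, "Lev": 0.8, "ATO": 0.5}   # hypothetical company
fixed = {"bias": 0.0}                                     # remaining features F
adjusted = adjust(toy_model, original, fixed)
```

A cost term of this kind is one way to encode the practical considerations mentioned above, since it favors the smallest intervention that still flips the predicted label.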

Conclusion
This paper aims to predict the degree of digital transformation in the manufacturing industry using machine learning techniques. It compares four machine learning algorithms and evaluates their performance on several metrics. The ETC algorithm achieved the highest predictive accuracy on the test set, with an accuracy of 0.74, an F1 score of 0.72, a recall of 0.67, and a precision of 0.75. To identify the most relevant features, the paper conducted correlation analysis and recursive feature elimination. The optimal feature subset comprises the proportion of R&D personnel, the proportion of R&D expenditure to operating income, the asset-liability ratio, the turnover of total assets, the proportion of the largest shareholder, the equity balance degree, the management shareholding ratio, and the financing constraint. Additionally, SHAP values were used to analyze interpretability and determine the ranking of feature importance. In descending order of importance, the features are the proportion of R&D personnel, the financing constraint, the proportion of R&D expenditure to operating income, the asset-liability ratio, the proportion of the largest shareholder, the management shareholding ratio, the equity balance degree, and the turnover of total assets.
From a corporate development perspective, it is imperative to increase the proportion of R&D personnel and the ratio of R&D expenses to operating income. R&D personnel typically possess technical expertise and innovation capabilities, enabling companies to address the technological challenges of digital transformation. Firms with a higher ratio of R&D expenses to operating income tend to prioritize technological innovation and invest significantly in research and development, which helps sustain a competitive advantage in digital transformation. Because both proportions have a substantial impact on a company's digital transformation, companies should set the proportion of R&D personnel and the ratio of R&D expenses to operating income judiciously, based on their own circumstances.
From the standpoint of debt-paying ability, there is a need to curtail the asset-liability ratio. The asset-liability ratio is a pivotal indicator of a company's financial health, reflecting the relationship between its assets and liabilities. A lower asset-liability ratio signifies reduced financial risk, potentially helping the company to navigate the challenges of digital transformation. In terms of operational capability, the company should strive to raise its capital turnover rate. The capital turnover rate is a vital financial management metric, indicative of the efficiency of a company's capital utilization and its debt-paying ability. A higher capital turnover rate signifies more efficient capital utilization, thereby helping the company to address the challenges of digital transformation.
Concerning the equity structure, the impact of the largest shareholder's holdings, the management shareholding ratio, and the equity balance on digital transformation is intricate. Companies should consider these factors comprehensively and formulate the optimal digital transformation strategy based on their individual circumstances. From a financing capability perspective, companies should alleviate financing constraints to expedite the pace and efficacy of digital transformation. Taken together, these conditions are beneficial for digital transformation in enterprises.
Based on the findings, the paper proposes a quantitative adjustment strategy for digital transformation that takes the actual conditions of the enterprise into account. This strategy aims to promote the progress of digital development in the manufacturing industry.

Fig 2. Performance of the four machine learning models on the test dataset.
Fig 2(A) compares the evaluation metrics of these models. Notably, the ETC and GBM models outperform the SVM and MLP models. The ETC model achieves an accuracy of 0.74, an F1 score of 0.72, a recall of 0.67, and a precision of 0.75. The ROC curves for the four models are shown in Fig 2(B).

Fig 6(B) illustrates that when SA > 2.5, the SHAP values for the samples are predominantly negative, which hampers an enterprise's digital transformation progress.

Table 1. Feature name and description. Columns: Features type, Name, Symbol, Description.
(Market value of tradable shares + number of non-tradable shares × net assets per share + book value of liabilities)/ total assets https://doi.org/10.1371/journal.pone.0299147.t001

Table 4. Comparison of raw data with adjusted data.
In contrast, the company with stock code 000017 started with higher Lev and ATO values but a low RDeapoinr value. By increasing RDeapoinr and lowering Lev and ATO in line with the company's actual conditions, the label was eventually changed from 0 to 1. (The adjusted values given are not absolute and serve only as a reference for improvement.)
https://doi.org/10.1371/journal.pone.0299147.t004